Abstract

The interference matrix (IM) is widely used in frequency planning/optimization of cellular
systems because it describes the interaction between any two cells. An IM is generated from source data
gathered from the cellular system, either mobile measurement reports (MMRs) or drive test (DT) records.
IM accuracy is not satisfactory since neither MMRs nor DT records contain complete information on
interference and traffic distribution. In this paper, two IM generation algorithms based on source data
fusion are proposed. In one algorithm, data fusion reinforces the MMRs data using the frequency-domain
information of DT data from the same region. In the other, data fusion reshapes the
DT data using the traffic distribution information extracted from MMRs from the same region. The
fused data contains more complete information, so a more accurate IM can be obtained. Simulation
results validate this conclusion.

Index Terms 

frequency plan, interference matrix, data fusion, mobile measurement reports, drive test records 

I. INTRODUCTION

Frequency planning is of great importance to large-scale cellular systems. A set of frequencies
is allocated among thousands of cells/cell-clusters in order to satisfy the capacity demand of
each cell. The limited frequencies are reused over non-overlapping areas, and inter-cell interference
is avoided owing to the frequency separation. In contrast to dynamic frequency allocation for
small-scale wireless networks or cognitive wireless networks, a frequency plan generated using
a centralized algorithm will be employed for several months in a large-scale cellular system until
there is an obvious change in the wireless environment or capacity demand.

Frequency planning for large-scale cellular systems has been an active research topic for decades.
Early research focused on frequency assignment that minimizes the number of frequencies
needed while satisfying the capacity demand and frequency separation constraints [1-12]. Here frequency
assignment is formulated as an optimization problem, and the constraint of frequency separation
between cell pairs is characterized by a compatibility matrix (also known as a channel separation
matrix) or an exclusion matrix.

In recent years, another type of practical frequency planning/optimization has drawn more
attention. It aims to minimize the intra-system interference by making full use of the
available frequency band owned by the network operator [13-17] under the actual
wireless environment. An interference matrix (IM) is employed to describe the interaction between
any two cells in a cellular system. An element of the IM is an indicator of the potential interference
intensity between two cells, assuming that the two cells operate at the same frequency. The
element value depends on the distance and propagation conditions between the two cells, the base
station parameters, and the traffic distribution of the cells.

The accuracy of an IM mainly depends on its source data. Source data comes from one of
two categories: model-generated data and system measurements. The models used to generate
source data include propagation models [13], [14] such as the Okumura-Hata model and ray-tracing models
[15]. Model-generated data does not reflect the real interaction between practical cells, thus
resulting in an impractical IM. System measurements are obtained during the operation of an actual
cellular system. The two most commonly used system measurements are (1) mobile measurement
reports (MMRs) sent by active user terminals to the base transceiver station (BTS) in dedicated
mode [18-20], and (2) drive tests arranged for routine system optimization [16], [17].

MMRs were originally designed for seamless handover between cells. Mobile phones send
MMRs to the BTS every 480 ms; each contains 7 signal strength measurements, one from the serving
cell and 6 from neighboring cells. The received signal strengths from these 6 neighboring
cells are the strongest among those of all the cells in the BA (BCCH Allocation) list. For simplicity of
presentation, these 6 neighboring cells are called the "strongest neighboring cells" in the following.
A BA list contains a limited number of neighboring cells of the serving cell, such as 32 for GSM,
and is preset manually by the network operator. When used in frequency planning/optimization, these
signal strength measurements are transformed into MMRs data reflecting the wireless environment
that users experienced, and the number of MMRs is a measure of traffic. As a result, an IM
generated from MMRs gives larger weight to regions with heavier traffic, and the follow-up
frequency planning/optimization can pay more attention to those regions. However, some
of the neighboring cell signals might be omitted from MMRs due to an outdated BA list or
indistinguishable cells operating at the same frequency. Although it was concluded through
simulations that this problem would not evidently degrade the accuracy of the MMRs-based IM
[21], [22], the incompleteness of frequency-domain information in MMRs is severe in those countries
and regions where the cellular network is adjusted frequently due to fast network expansion,
explosive traffic growth, GSM900 and DCS1800 coexistence, and the transition from 2G to 3G.
This has several causes: (1) the BA list is not updated in time; (2) some of the 6 measurements
might wrongly report the signal strengths of other frequency bands (such as DCS1800) or other
systems (such as TD-SCDMA or WCDMA), so that fewer than 6 measurements are available for
the system itself (such as GSM900). Some of the elements in an MMRs-based IM might be wrongly
set to zero as a result of incomplete information on the potentially interfering neighboring cells.

A drive test (DT) is performed with a frequency sweeper carried on a vehicle traveling along roads,
for the purpose of obtaining good knowledge of the radio environment. DT records are of high
accuracy and consist of measurements of all the frequencies over a large frequency band. The
number and distribution of DT records depend on the vehicle velocity and road situation, and are thus
useless to frequency planning/optimization. As a source of IM data, DT data transformed
from DT records offers complete information on potentially interfering neighboring cells, but it
cannot reveal the actual communication traffic distribution.

In this paper, two IM generation algorithms based on source data fusion are proposed. The
fused data contains more complete information so that a more accurate IM can be obtained. The
steps of these two algorithms are as follows. First, the DT data of the serving cell is
clustered into K clusters according to its frequency-domain information; each cluster has a
unique characteristic and corresponds to a specific geographic region in the cell. Secondly, the
number of MMRs in each cluster is calculated through a multiple linear regression, which gives
an estimate of the traffic generated in the corresponding geographic region. Thirdly,
a data fusion is carried out to reinforce the MMRs data, using the frequency-domain information
of DT data from the same region; or another data fusion is performed to reshape the DT data,
using the traffic distribution information extracted from MMRs from the same region. Finally,
the reinforced MMRs data or the reshaped DT data is used to calculate the IM. These two proposed
algorithms are named the MMRs+DT algorithm and the DT+MMRs algorithm, respectively. They
share the steps of DT data clustering, MMRs classification and traffic distribution estimation,
and IM calculation, but differ in the data fusion.

The rest of this paper is organized as follows. In Section II, the source data and the conventional
IM generation algorithm are introduced. The proposed MMRs+DT algorithm and DT+MMRs
algorithm are described in Sections III and IV, respectively. Simulation results are given in
Section V to verify the improvement in information completeness obtained by utilizing both sources
of data. The main conclusions are drawn in Section VI.

II. SOURCE DATA AND CONVENTIONAL IM GENERATION ALGORITHM 

For simplicity of presentation and notation, the following discussion focuses on the
scenario of one serving cell with its neighboring cells. It can be easily generalized to a normal
cellular system scenario.

A. MMRs Data 

Assume that signal strength measurements of J neighboring cells are reported in MMRs. From
the signal strength measurements reported in MMRs, the carrier-to-interference ratio (CIR) of
serving cell u and neighboring cell v can be calculated. Assume there are Q successive and
non-overlapping intervals of CIR value, with interval index $q = 1, \ldots, Q$.

MMRs data is defined as the number of CIR values falling into each particular CIR interval,
and is expressed as a JQ-dimensional column vector $\mathbf{R} = [\mathbf{R}_1^T \, \mathbf{R}_2^T \ldots \mathbf{R}_j^T \ldots \mathbf{R}_J^T]^T$, where each Q-dimensional
column vector $\mathbf{R}_j$ is the jth column in Figure 1. The element in the $[(j-1)Q + q]$th
row of $\mathbf{R}$ is denoted as $r_{j,q}$, which is the number of CIR values against neighboring cell j
falling into CIR interval q. For example, $r_{3,2}$ is the number of CIR values against neighboring cell 3
falling into CIR interval 2, and $r_{3,2} = 100$ in Figure 1.
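
The counting that defines the MMRs data vector can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `mmrs_data`, the interval boundaries `edges`, and the sample CIR values are assumptions, since the paper does not state the interval boundaries.

```python
from bisect import bisect_right


def mmrs_data(cir_by_cell, edges):
    """Count CIR values per (neighboring cell, CIR interval).

    cir_by_cell: list of J lists; entry j-1 holds the CIR values (dB)
                 computed against neighboring cell j.
    edges:       Q-1 ascending interior boundaries in dB (assumed values);
                 interval 1 is (-inf, edges[0]) and interval Q is
                 [edges[-1], +inf).
    Returns the vector R as a flat list of length J*Q, where
    R[(j-1)*Q + (q-1)] holds r_{j,q}.
    """
    Q = len(edges) + 1
    R = []
    for values in cir_by_cell:
        counts = [0] * Q
        for cir in values:
            # bisect_right gives the 0-based index of the interval.
            counts[bisect_right(edges, cir)] += 1
        R.extend(counts)
    return R


# Two neighboring cells, Q = 3 intervals split at 0 dB and 9 dB.
R = mmrs_data([[-3, 5, 12], [1, 2]], [0, 9])
print(R)  # [1, 1, 1, 0, 2, 0]
```

The flat layout mirrors the stacking of the per-cell vectors, so the count for cell j and interval q sits at position (j-1)Q + q when counting from 1.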

B. DT Data 

Assume that there are M DT records containing signal strengths from I neighboring cells. DT
data is expressed as a matrix $\mathbf{D} = [\mathbf{d}_1 \, \mathbf{d}_2 \ldots \mathbf{d}_m \ldots \mathbf{d}_M]$, where the mth column vector $\mathbf{d}_m$ consists
of the CIR values against all the I neighboring cells, calculated from
the signal strengths in the mth DT record. That is, the element $d_{i,m}$ in the ith row and mth column
of $\mathbf{D}$ is the CIR value against neighboring cell i, which is calculated from the signal strengths
in the mth DT record.

I and J are usually not equal, since both the measurement methods and the tools for obtaining MMRs
and DT records are different.

C. Generating IM 

As a form of IM, the inter-cell dependency matrix (ICDM) is widely used owing to its accuracy
and simplicity [18]. Each element of the ICDM is an estimate of the probability of a CIR value
being lower than a preset threshold $CIR_{\text{threshold}}$, or equivalently, the probability of a CIR interval
index being lower than a preset threshold $Q_{\text{threshold}}$, $Q_{\text{threshold}} < Q$. The procedures for obtaining
an ICDM element from MMRs data and from DT data are shown in Figure 2 and Figure 3,
respectively.
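
Since Figures 2 and 3 are not reproduced in the text, the following Python sketch follows only the verbal definition above. It assumes that an ICDM element is the fraction of CIR samples in intervals 1 to the interval threshold (inclusive, matching the later use of the threshold in Section III-D) for MMRs data, and the fraction of CIR values below the CIR threshold for DT data. The function names and sample numbers are hypothetical.

```python
def icdm_from_mmrs(R, Q, j, q_threshold):
    """Estimate P(CIR interval index <= q_threshold) for neighboring cell j.

    R is the flat MMRs data vector of length J*Q; j is 1-based.
    Treating the threshold as inclusive is an assumption of this sketch.
    """
    block = R[(j - 1) * Q: j * Q]      # the Q counts of cell j
    total = sum(block)
    if total == 0:
        return 0.0                      # no reports for this cell
    return sum(block[:q_threshold]) / total


def icdm_from_dt(d_row, cir_threshold):
    """Estimate P(CIR < cir_threshold) from one row of the DT matrix D."""
    if not d_row:
        return 0.0
    below = sum(1 for cir in d_row if cir < cir_threshold)
    return below / len(d_row)


R = [1, 1, 1, 0, 2, 0]                  # J = 2 cells, Q = 3 intervals
print(icdm_from_mmrs(R, 3, 2, 2))       # 1.0
print(icdm_from_dt([-3, 5, 12, 1], 2))  # 0.5
```

Note that the MMRs-based estimate is weighted by the number of reports, while the DT-based estimate is weighted by the number of drive-test records.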

III. IM GENERATION ALGORITHM BASED ON MMRS DATA REINFORCED BY DT DATA

The MMRs data and DT data to be fused should come from the same geographical position or
region, since only then do they describe the same wireless environment. Therefore, the
MMRs data and DT data from the same position or region have to be identified and separated
from the whole MMRs data and DT data first.

The frequency spectrum consists of the signal strengths of all the signals from all the cells. The
relative strengths at different frequencies correspond to the position where the MMRs and/or
DT records were measured. The CIR values against all the neighboring cells can be obtained
from the frequency spectrum, and then constitute a "CIR spectrum". The CIR spectrum (CIRS) can
be regarded as the "fingerprint" of a position, as it is closely related to the position where the
wireless signals are measured. The CIRS can be derived from DT records, since DT records have
complete frequency-domain information as discussed in Section I.

MMRs data consists of the numbers of CIR values against all the J neighboring cells falling into
different CIR intervals. Therefore, it is essentially a statistic of the CIRSs at different positions.
Based on the characteristics of the CIRS, MMRs data from different positions can be identified
and classified by regression analysis, provided that all the CIRSs are known.

To reduce the number of CIRS types in the serving cell and thus reduce the computational
complexity of regression analysis, all the M CIRSs derived from DT data are clustered into K
clusters by clustering analysis, and each cluster is associated with a specific region in the serving
cell. Therefore, MMRs data will be identified and classified by multiple linear regression on a
per-cluster basis.

The block diagram of this algorithm is shown in Figure 4. The CIRS profile, clustering, 
regression and data fusion will be discussed in this section in turn. 

A. CIRSP and SP Matrix 

CIRS derived from DT data is the "fingerprint" of geographical position. Note that the main 
characteristic of CIRS is from the 6 neighboring cells with the strongest signals, i.e., from the 
so called "the 6 strongest neighboring cells". For data fusion, CIRS profile (CIRSP) is defined 
here. It is transformed from CIRS by reserving the information of the 6 strongest neighboring 
cells and removing those of all the other cells, thus is much simpler than CIRS. The CIRSP 
element corresponding to a neighboring cell is set to q, if the neighboring cell is one of the 
6 strongest neighboring cells and the CIR against this neighboring cell falls into the qth CIR 
interval. The CIRSP elements corresponding to the other neighboring cells are set to zero. The 
CIR intervals mentioned here agree with those in MMRs data calculation in Section II-A. 
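
The CIRSP construction can be illustrated with a short Python sketch. It is a sketch under stated assumptions: the function name `cirsp`, the interval boundaries `edges`, and the use of `None` for undetected cells are hypothetical. It also assumes that, within one DT record, the strongest neighboring signals are those with the lowest CIR, because the serving-cell carrier is the same for every neighbor.

```python
from bisect import bisect_right


def cirsp(cir_values, edges, n_strongest=6):
    """Build a CIRS profile (CIRSP) from one DT record.

    cir_values:  list of I CIR values (dB), one per neighboring cell;
                 None marks a cell that was not detected (assumption).
    edges:       Q-1 ascending interior interval boundaries (assumed).
    n_strongest: number of strongest neighbors kept (6 in the paper).
    Returns a list of I entries: the 1-based CIR interval index q for the
    strongest neighbors, and 0 for all other cells.
    """
    detected = [(cir, i) for i, cir in enumerate(cir_values) if cir is not None]
    detected.sort()                     # lowest CIR = strongest neighbor
    profile = [0] * len(cir_values)
    for cir, i in detected[:n_strongest]:
        profile[i] = bisect_right(edges, cir) + 1
    return profile


# Five neighboring cells, Q = 3 intervals, keep the 2 strongest for brevity.
print(cirsp([20, 3, None, -2, 15], [0, 9], n_strongest=2))  # [0, 2, 0, 1, 0]
```

Keeping only the strongest neighbors mirrors what an MMR can report, which is what makes the profile comparable with MMRs data.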

An example of the CIRSP is shown in Figure 5. The serving cell is in the center and the serial 
numbers of its neighboring cells are marked. The CIRSPs from locations A and B are plotted 
on the right of the figure. It can be observed that location A is close to the center of serving 
cell and far away from all the neighboring cells so that the CIRs against these neighboring cells 
fall into the Qth CIR interval, resulting in that the CIRSP's elements corresponding to the 6 
first-ring neighboring cells are Q while the other elements are 0. Similarly, location B is close to
the edge of the serving cell so that all the CIRs are low; cells 1, 2, 4, 5, 6 and 7 are the 6 strongest
neighboring cells so that the corresponding CIRSP elements are 1, 1, $Q-1$, $Q-1$, $Q-1$ and $Q-1$, and
the other CIRSP elements are zero.

For the purpose of regression analysis in the following subsection, the CIRSP is transformed
into a spectrum profile (SP) matrix $\mathbf{S} = [\mathbf{s}_1 \, \mathbf{s}_2 \ldots \mathbf{s}_m \ldots \mathbf{s}_M]_{IQ \times M}$ whose format matches that
of MMRs data. Its mth column $\mathbf{s}_m = [\mathbf{s}_{1,m}^T \, \mathbf{s}_{2,m}^T \ldots \mathbf{s}_{i,m}^T \ldots \mathbf{s}_{I,m}^T]^T_{IQ \times 1}$ is called the CIRSP vector, and is
transformed in the following way from the CIRSP derived from the mth DT record. Vector
$\mathbf{s}_{i,m} = \mathbf{0}$ if the CIRSP element corresponding to neighboring cell i is zero. The qth element
of vector $\mathbf{s}_{i,m}$, which is the element in the $[(i-1)Q + q]$th row and mth column of $\mathbf{S}$ and is
denoted as $s_{i,q,m}$, is one while all the other elements of vector $\mathbf{s}_{i,m}$ are zero if the CIRSP element
corresponding to neighboring cell i is q. That is, $s_{i,q,m}$ equals 1 if neighboring cell i is one
of the 6 strongest neighboring cells and $d_{i,m}$ falls into CIR interval q; it equals 0 otherwise.
An example of $\mathbf{S}$ is shown in Figure 6. "1" appears at positions (1,2), (2,1) and (3,Q) since
the original CIRSP elements corresponding to neighboring cells 1, 2, 3 and 4 are 2, 1, Q and 0,
respectively.
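
The transformation from a CIRSP to a CIRSP vector is a per-cell one-hot encoding, which can be sketched as follows. The function name `cirsp_vector` is hypothetical, and the example uses Q = 3 so that the element Q of the Figure 6 example becomes 3.

```python
def cirsp_vector(profile, Q):
    """One-hot encode a CIRSP into one column of the SP matrix S.

    profile: list of I CIRSP elements (0, or an interval index 1..Q).
    Returns a list of length I*Q in which entry (i-1)*Q + (q-1) is 1 when
    the CIRSP element of cell i equals q, and 0 otherwise.
    """
    s = [0] * (len(profile) * Q)
    for i, q in enumerate(profile):     # i is the 0-based cell index
        if q:                           # 0 means "not among the strongest"
            s[i * Q + (q - 1)] = 1
    return s


# CIRSP elements of cells 1..4 are 2, 1, Q and 0, with Q = 3.
print(cirsp_vector([2, 1, 3, 0], 3))
# [0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
```

Because each cell contributes at most a single 1, summing such vectors over many records yields per-interval counts in the same layout as MMRs data.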

B. Clustering and Cluster Center 

The M CIRSPs derived from DT data are clustered into K clusters by clustering analysis [23]
according to CIRSP similarity. Each cluster corresponds to a characteristic region (called region
for short) in the serving cell. Correspondingly, DT records are classified into the kth set if their
CIRSPs belong to the kth cluster, $k = 1, \ldots, K$.

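Membership vector $\mathbf{A}$, size vector $\mathbf{B}$ and cluster center matrix $\mathbf{C}$ are obtained through clustering
analysis. $\mathbf{A} = [a_1 \, a_2 \ldots a_m \ldots a_M]_{1 \times M}$, and its element $a_m = k$ if the mth DT record is classified
into the kth set. $\mathbf{B} = [b_1 \, b_2 \ldots b_k \ldots b_K]_{1 \times K}$, and $b_k$ is the number of DT records in the kth set,
$b_1 + b_2 + \ldots + b_K = M$. Vector $\mathbf{B}$ describes the geographic distribution of DT data over all the
K regions. The cluster center matrix is expressed as

$$\mathbf{C} = [\mathbf{c}_1 \, \mathbf{c}_2 \ldots \mathbf{c}_k \ldots \mathbf{c}_K]_{IQ \times K} = \begin{bmatrix} \mathbf{c}_{1,1} & \cdots & \mathbf{c}_{1,k} & \cdots & \mathbf{c}_{1,K} \\ \vdots & & \vdots & & \vdots \\ \mathbf{c}_{I,1} & \cdots & \mathbf{c}_{I,k} & \cdots & \mathbf{c}_{I,K} \end{bmatrix} \tag{1}$$

where column vector $\mathbf{c}_k$ of length IQ is the cluster center of the kth cluster, while $\mathbf{c}_{i,k}$ is a column
vector of length Q,

$$\mathbf{c}_{i,k} = [c_{i,1,k} \, c_{i,2,k} \ldots c_{i,q,k} \ldots c_{i,Q,k}]^T \tag{2}$$

Its qth element $c_{i,q,k}$ is the matrix element at the $[(i-1)Q + q]$th row and the kth column of $\mathbf{C}$,

$$c_{i,q,k} = \frac{1}{b_k} \sum_{m:\, a_m = k} s_{i,q,m} \tag{3}$$

For the kth cluster and neighboring cell i, $c_{i,q,k}$ is the probability that the CIRSP element equals
q, and $\mathbf{c}_{i,k}$ is the probability distribution of the CIRSP elements over all the Q CIR intervals. For
the kth cluster, $\mathbf{c}_k$ gives I probability distributions of the CIRSP elements over all the Q CIR
intervals, corresponding to the I neighboring cells.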
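
Given a cluster assignment, the size vector and the cluster centers can be computed as in the sketch below. It assumes that a cluster center is the arithmetic mean of the member CIRSP vectors, as in the K-means clustering named in Table I, which is what makes each entry interpretable as a probability. The function name `cluster_centers` and the toy data are hypothetical; the clustering step itself is not shown.

```python
def cluster_centers(S_columns, membership, K):
    """Compute the size vector B and the cluster centers.

    S_columns:  list of M CIRSP vectors (the columns of S), each of length I*Q.
    membership: list of M labels a_m in 1..K (output of a clustering step).
    Returns (B, C) where B[k-1] is b_k and C[k-1] is the center c_k,
    taken here as the mean of the member vectors (assumption: K-means).
    """
    length = len(S_columns[0])
    B = [0] * K
    sums = [[0.0] * length for _ in range(K)]
    for s, k in zip(S_columns, membership):
        B[k - 1] += 1
        for r, value in enumerate(s):
            sums[k - 1][r] += value
    # An empty cluster gets an all-zero center to avoid division by zero.
    C = [[x / B[k] if B[k] else 0.0 for x in sums[k]] for k in range(K)]
    return B, C


S = [[1, 0], [0, 1], [1, 0]]            # M = 3 records, I*Q = 2
B, C = cluster_centers(S, [1, 1, 2], 2)
print(B)  # [2, 1]
print(C)  # [[0.5, 0.5], [1.0, 0.0]]
```

Each center entry is the share of records in the cluster whose CIRSP element for a given cell equals a given interval index.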

C. Regression and Traffic Distribution 

The purpose of the regression is to estimate the number of MMRs belonging to each cluster.
Assume that there are N neighboring cells whose signal strength measurements are
contained in both MMRs and DT records, $N \le I$ and $N \le J$. These N neighboring cells
are called common neighboring cells. The elements corresponding to the common
neighboring cells are picked out from vector $\mathbf{R}$ and matrix $\mathbf{C}$ as the samples of the dependent variable and the
explanatory variables for the multiple linear regression model.

Without loss of generality, we assume that the first N neighboring cells of both MMRs data
and DT data are the common neighboring cells, with index $n = 1, \ldots, N$. Therefore there are $I - N$
neighboring cells whose signal strength measurements are contained only in DT records, with
cell index $n' = N + 1, \ldots, I$.

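An augmented matrix is constructed as

$$\tilde{\mathbf{C}} = [\mathbf{1} \;\; \mathbf{C}]_{IQ \times (K+1)} \tag{4}$$

where $\mathbf{1}$ is an all-"1" vector of length IQ. Another augmented matrix $\tilde{\mathbf{C}}'$ is constructed as

$$\tilde{\mathbf{C}}' = [\mathbf{1}' \;\; \mathbf{C}']_{NQ \times (K+1)} \tag{5}$$

where $\mathbf{1}'$ is an all-"1" vector of length NQ, and $\mathbf{C}'$ contains the top NQ rows of $\mathbf{C}$ and can be
expressed as

$$\mathbf{C}' = [\mathbf{c}'_1 \, \mathbf{c}'_2 \ldots \mathbf{c}'_k \ldots \mathbf{c}'_K]_{NQ \times K} \tag{6}$$

thus vector $\mathbf{c}'_k$ contains the top NQ rows of $\mathbf{c}_k$. Denote the random error vector as

$$\boldsymbol{\varepsilon} = [\varepsilon_1 \, \varepsilon_2 \ldots \varepsilon_{NQ}]^T \tag{7}$$

where $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{NQ}$ are i.i.d. and normally distributed. Therefore, the multiple linear
regression model can be formulated as

$$\mathbf{R}' = \tilde{\mathbf{C}}' \boldsymbol{\beta} + \boldsymbol{\varepsilon} \tag{8}$$

where $\mathbf{R}'$ is a column vector containing only the top NQ rows of $\mathbf{R}$.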

Regression model (8) holds because MMRs data is actually a linear combination of the
CIRSPs derived from DT records, if the error resulting from the different measurement
methods and tools is not considered. Therefore, MMRs data can be viewed approximately as a linear
combination of the K cluster centers. The coefficients can be calculated from (8) and compose
a (K + 1)-dimensional vector

$$\boldsymbol{\beta} = [\beta_0 \ldots \beta_k \ldots \beta_K]^T \tag{9}$$

where $\beta_k$ is the estimate of the number of MMRs reported from the kth region. The
constant $\beta_0$ is the estimate of the number of MMRs whose CIRSPs are not included in DT records.
$\beta_0$ is generally not zero since the drive test might not cover the whole serving cell and thus
leads to a partial mismatch between the CIRSPs of MMRs and DT records. Apparently, $\boldsymbol{\beta}$ is an
estimate of the practical traffic distribution over different regions.
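
The regression step can be illustrated with a minimal least-squares sketch in plain Python. Note the assumptions: it fits ordinary least squares with all K cluster centers at once, whereas the simulations in Section V use stepwise regression; the function names `ols` and `estimate_traffic` and the toy numbers are hypothetical.

```python
def ols(X, y):
    """Least-squares fit of y = X b via the normal equations.

    Solved with Gauss-Jordan elimination and partial pivoting; adequate
    for a small illustration, not a numerically robust solver.
    """
    n, p = len(X), len(X[0])
    # Augmented system [X^T X | X^T y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                for c in range(col, p + 1):
                    A[r][c] -= factor * A[col][c]
    return [A[i][p] / A[i][i] for i in range(p)]


def estimate_traffic(R_common, C_common):
    """Estimate [beta_0, beta_1, ..., beta_K] from regression model (8).

    R_common: MMRs data of the common cells (length N*Q).
    C_common: list of K cluster centers restricted to the common cells,
              each of length N*Q.
    """
    # Prepend the all-ones column of the augmented matrix.
    X = [[1.0] + [c[r] for c in C_common] for r in range(len(R_common))]
    return ols(X, R_common)


centers = [[1, 0, 0, 0.5], [0, 1, 0, 0.5]]           # K = 2, N*Q = 4
beta = estimate_traffic([12, 22, 2, 17], centers)    # built from beta = (2, 10, 20)
print([round(b, 6) for b in beta])
```

In this toy case the MMRs data was constructed exactly as 2 + 10 c'_1 + 20 c'_2, so the fit recovers those coefficients up to rounding.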

D. Data Fusion and IM Generation 

Data fusion is done in the following way: the severe interfering neighboring cells omitted
from MMRs are found and recovered from the DT data, and then complemented into the MMRs
data. A neighboring cell $n'$ is a severe interfering neighboring cell if

$$\sum_{q=1}^{Q_{\text{threshold}}} \sum_{k=1}^{K} \beta_k c_{n',q,k} > 0 \tag{10}$$

where $Q_{\text{threshold}}$ is the preset CIR interval threshold as stated in Section II-C. If the CIR value against
a neighboring cell falls into intervals numbered from 1 to $Q_{\text{threshold}}$, this neighboring cell is a
potential interference source because its signal is strong enough. If (10) is satisfied, neighboring
cell $n'$ is one of the 6 strongest neighboring cells, and the CIR against neighboring cell $n'$ is
lower than the CIR threshold in one or more regions.

If neighboring cell n' satisfies (10) and its signal is only detected by drive test, it has been 
omitted from MMRs for some reason and needs to be complemented to MMRs data. 
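
Criterion (10) amounts to a traffic-weighted sum over the low-CIR intervals of one cell, which can be sketched as below. The function name `is_severe_interferer`, the flat list layout of the cluster centers, and the sample values are assumptions made for illustration.

```python
def is_severe_interferer(beta, C, n_prime, Q, q_threshold):
    """Evaluate criterion (10) for neighboring cell n' (1-based).

    beta: [beta_0, beta_1, ..., beta_K] from the regression.
    C:    list of K cluster centers, each a flat list of length I*Q where
          entry (i-1)*Q + (q-1) holds c_{i,q,k}.
    Returns True when the traffic-weighted probability mass in CIR
    intervals 1..q_threshold is positive.
    """
    start = (n_prime - 1) * Q
    total = sum(beta[k + 1] * C[k][start + q]
                for k in range(len(C))
                for q in range(q_threshold))
    return total > 0


# I = 2 cells, Q = 3 intervals, K = 2 clusters.
C = [[0, 0, 1, 0.5, 0, 0],
     [0, 1, 0, 0, 0, 0]]
beta = [5, 10, 20]
print(is_severe_interferer(beta, C, 2, 3, 1))  # True
print(is_severe_interferer(beta, C, 1, 3, 1))  # False
```

The constant term beta_0 is deliberately left out, since the sum in (10) runs over k = 1, ..., K only.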

Assume there are $N'$ ($N' \le I - N$) omitted severe interfering neighboring cells in total; then
the corresponding MMRs data $\mathbf{R}''$ can be estimated by the regression model as

$$\mathbf{R}'' = \tilde{\mathbf{C}}'' \boldsymbol{\beta} \tag{11}$$

where $\tilde{\mathbf{C}}''$ is a matrix containing $N'Q$ rows of the augmented matrix $\tilde{\mathbf{C}}$, and these $N'Q$ rows
are associated with the severe interfering neighboring cells which are omitted from MMRs.

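The fused MMRs data is then constructed from the MMRs data $\mathbf{R}$ and the complemented MMRs data
$\mathbf{R}''$ as

$$\tilde{\mathbf{R}} = \begin{bmatrix} \mathbf{R} \\ \mathbf{R}'' \end{bmatrix} \tag{12}$$

Therefore $\tilde{\mathbf{R}}$ is called reinforced MMRs data.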
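
Equations (11) and (12) can be sketched together: the MMRs data of each omitted cell is predicted from the regression coefficients and the corresponding rows of the augmented cluster center matrix, then appended to the original MMRs data. The function name `reinforce_mmrs` and the sample values are hypothetical; the sketch includes the constant term because (11) uses rows of the augmented matrix.

```python
def reinforce_mmrs(R, beta, C, omitted_cells, Q):
    """Append estimated MMRs data of omitted cells to R, as in (11)-(12).

    R:             original MMRs data (flat list).
    beta:          [beta_0, beta_1, ..., beta_K].
    C:             list of K cluster centers, each a flat list of length I*Q.
    omitted_cells: 1-based DT indices of the omitted severe interferers.
    """
    fused = list(R)
    for n_prime in omitted_cells:
        start = (n_prime - 1) * Q
        for q in range(Q):
            # One row of the augmented matrix times beta: the leading 1
            # multiplies beta_0, the remaining entries are c_{n',q,k}.
            estimate = beta[0] + sum(beta[k + 1] * C[k][start + q]
                                     for k in range(len(C)))
            fused.append(estimate)
    return fused


C = [[0, 0, 1, 0.5, 0, 0],
     [0, 1, 0, 0, 0, 0]]
print(reinforce_mmrs([3, 4, 5], [1, 10, 20], C, [2], 3))
# [3, 4, 5, 6.0, 1, 1]
```

The reinforced vector keeps the original counts untouched and only adds Q new entries per recovered cell.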

Finally, an IM is generated from the reinforced MMRs data in the way described in Section II-C. This IM is named
IM-MR' for short.

For simplicity, the IM generation algorithm proposed in this section is named the MMRs+DT
algorithm. The fused source data is the reinforced MMRs data, which is derived by complementing
MMRs data with the originally omitted severe interfering neighboring cells provided
by DT data.



IV. IM GENERATION ALGORITHM BASED ON DT DATA RESHAPED BY MMRS DATA 

An IM can also be generated using DT data, as stated in Section I. However, the
geographical distribution of DT data $\mathbf{B} = [b_1 \, b_2 \ldots b_k \ldots b_K]_{1 \times K}$ relies on the vehicle velocity
and road situation, and is thus useless to frequency planning/optimization. If the geographical
distribution is replaced by the actual communication traffic distribution, then the generated IM
will contain traffic information and result in more reasonable frequency planning/optimization.

The algorithm proposed in this section is named DT+MMRs for short. The block diagram
of this algorithm is shown in Figure 7. Data Fusion II is the only block different from that in
Figure 4; it generates the reshaped DT data by changing the DT data's distribution from $\mathbf{B}$ to $\boldsymbol{\beta}$. The
MMRs data and DT data in Figure 7 play different roles from those in Figure 4. The output of
the algorithm is named IM-DT' to reflect that it is an IM generated from the reshaped DT data.

Changing the distribution of DT data from $\mathbf{B}$ to $\boldsymbol{\beta}$ is achieved by replicating the $b_k$ DT records
into $\lfloor \beta_k \rfloor$ DT records for every set, where $\lfloor x \rfloor$ means the maximal integer no larger than x. The
reshaping procedure for the kth set is as follows: each DT record of the set is replicated
$E_k = \lfloor \beta_k / b_k \rfloor$ times, then $F_k = \lfloor \beta_k \rfloor \bmod b_k$ DT records are picked out from the set randomly, and thus
a total of $E_k b_k + F_k \approx \beta_k$ DT records are obtained.

V. SIMULATION RESULTS 

Simulations were carried out to evaluate the effects of the proposed data fusions, and to 
make comparison between the IM generated from the fused data using our algorithms and that 
generated from the original data using traditional IM generation algorithms. 

MMRs and DT records were collected from a practical cellular system. The simulated area
is shown in Figure 8, where base stations (BSs) are marked as five-pointed stars and the track
of the drive test is marked as dots. Cell 1 in the top left of the figure is the serving cell. The
simulation settings are given in Table I.

A. Clustering and Characteristic Regions 

The service signal of Cell 1 mainly covers two roads perpendicular to each other and the
buildings along them. The drive test was executed along these two roads. All the obtained DT
records are plotted in Figure 9 according to the longitudes and latitudes of the locations where
the DT records were measured. By clustering analysis, the DT data of Cell 1 was clustered into 8
clusters, and the DT records were classified into 8 sets. DT records belonging to different sets
are plotted with different markers in Figure 9. It can be observed from Figure 9 that clusters and
locations are strongly related, since the DT records obtained from a geographical region with
a unique wireless environment have the same CIRSP.

B. Regression and Traffic Distribution

The explanatory variables $\mathbf{c}'_k$ ($k = 1, 2, \ldots, 8$) enter the regression model in turn according to
the order ranked by the partial correlation coefficient between each of them and the dependent variable $\mathbf{R}'$. In
this simulated regression model, explanatory variables entered in the order of $\mathbf{c}'_2$, $\mathbf{c}'_1$, $\mathbf{c}'_3$ and
$\mathbf{c}'_6$. The significance probability of the regression model is 0.0001, which is smaller than the
preset significance level (0.05), indicating that the dependent variable is strongly related to all
the explanatory variables that entered the model. The coefficient of determination is 0.858, meaning
that the MMRs data is effectively decomposed by the 8 cluster centers. The
model coefficients are listed in Table II, and they all passed the significance test.

C. Data Fusion and Obtained IMs 

Data Fusion I: reinforcing MMRs data with originally omitted severe interfering neighboring
cells. For Cell 1, the solid five-pointed stars in Figure 8 represent the BS sites of the neighboring
cells with the interfering frequencies. After applying Data Fusion I to Cell 1, three severe
interfering neighboring cells that were originally omitted are recovered, and their BS sites are marked as
hollow five-pointed stars in Figure 8. The MMRs data of these three neighboring cells is calculated
using (11) and combined with the original MMRs data to form the reinforced MMRs data using
(12). The obtained IM-MR' therefore differs from the IM-MR obtained from the original data
using traditional algorithms.

Data Fusion II: reshaping the distribution of DT data. The reshaped DT data is obtained
by the DT+MMRs algorithm. The cumulative probability distributions of the original DT data and
the reshaped DT data are illustrated in Figure 10. It can be seen that the elements
of the IM generated from the reshaped DT data (marked as DT' data in Figure 10) are different from
those of the IM generated from the original DT data.

It is expected that the similarity of IM-DT' and IM-MR is higher than that of IM-DT and
IM-MR, since IM-DT' has traffic information and thus is closer to IM-MR. For the cells
in Figure 8, the correlation coefficient between IM-DT' and IM-MR is 0.608, and that between IM-DT
and IM-MR is 0.549. This verifies
that the proposed Data Fusion II does provide useful traffic information.

IM generated from fused data. The IM of the simulation area is generated as IM-MR' and
IM-DT' from the reinforced MMRs data and the reshaped DT data, respectively. The Pearson correlation
coefficient between IM-MR' and IM-DT' is 0.613; therefore these two IMs are highly correlated.
This is reasonable because they both describe the same wireless environment.
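
The IM comparisons in this section rely on the Pearson correlation coefficient, which can be computed as in the sketch below. How the IM elements are arranged into two equal-length sequences is not stated in the paper, so the flat lists here, like the function name `pearson` and the sample values, are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    x, y: corresponding elements of two IMs, flattened in the same order
          (the flattening is an assumption of this sketch).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


im_a = [0.10, 0.40, 0.05, 0.30]         # hypothetical IM elements
im_b = [0.12, 0.35, 0.10, 0.33]
print(round(pearson(im_a, im_b), 3))
```

The coefficient is undefined when either sequence is constant, in which case this sketch raises a `ZeroDivisionError`.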

VI. CONCLUSIONS 

DT records and MMRs describe the intra-system interference from the views of geographical 
coverage and customer experience respectively. Both of them have advantages and disadvantages. 
They can be fused to obtain an accurate IM with complete information. 

In this paper, two IM generation algorithms based on source data fusion are proposed. The
MMRs+DT algorithm is mainly based on MMRs data, while DT data is used to reinforce
the frequency-domain information of the MMRs data. Conversely, the DT+MMRs algorithm
is mainly based on DT data, while MMRs data is used to provide the traffic distribution. IM-MR'
and IM-DT' generated from the fused source data are theoretically more accurate than IM-MR
and IM-DT generated from the original source data, respectively.

The simulation results show that the CIRSP is the "fingerprint" of a position, as the clusters match
the regions; MMRs data and DT data can therefore be classified and matched according to the CIRSP.
DT data from a region can be used to recover the severe interfering neighboring cells omitted
in the MMRs data from the same region. The geographical distribution of DT data over regions can
be reshaped to the practical traffic distribution. IM-MR' and IM-DT' are highly correlated, so
each of them can be used for better frequency planning/optimization.

Since the source data is easy to obtain in conventional ways and no additional resources
such as a 3-dimensional digital map are needed, the proposed algorithms are applicable to practical
engineering.

The proposed technique also provides a reference for frequency planning/optimization
of 3G and future systems.

Fig. 1. An example of MMRs data.

Fig. 2. The procedure of obtaining ICDM element generated from MMRs data.

Fig. 4. The procedure of MMRs+DT algorithm.



TABLE I

Simulation settings

DT data (cell 1)      M = 609, I = 123
MMRs data (cell 1)    Q = 10, J = 103
Clustering            K-means clustering with K = 8
Regression            Stepwise regression [23]. Significance level for a variable to enter
                      the model is 0.05, and that for a variable to be removed is 0.10.

Fig. 5. Measurement positions and CIRSPs.

TABLE II

The coefficients of regression

$\beta_0$        $\beta_2$          $\beta_1$          $\beta_3$         $\beta_6$
31391.563    1377763.572    1724748.834    842328.720    182019.175

Fig. 7. The procedure of DT+MMRs algorithm.

Fig. 8. Geographical positions of the cells in the simulated area.

Fig. 9. The 8 sets of DT records belonging to the 8 regions in the serving cell.

Fig. 10. The cumulative probability distributions of the CIR against a neighboring cell.



